Neural Networks
Elsevier BV
Preprints posted in the last 30 days, ranked by how well they match the content profile of Neural Networks, based on 32 papers previously published in the journal. The average preprint scores a 0.03% match for this journal, so anything above that is already an above-average fit.
Sun, G.; Huang, N.; Yan, H.; Zhou, J.; Li, Q.; Lei, B.; Zhong, Y.; Wang, L.
Generalization is a fundamental criterion for evaluating learning effectiveness, a domain where biological intelligence excels yet artificial intelligence continues to face challenges. In biological learning and memory, the well-documented spacing effect shows that appropriately spaced intervals between learning trials can significantly improve behavioral performance. While multiple theories have been proposed to explain its underlying mechanisms, one compelling hypothesis is that spaced training promotes integration of input and innate variations, thereby enhancing generalization to novel but related scenarios. Here we examine this hypothesis by introducing a bio-inspired spacing effect into artificial neural networks, integrating input and innate variations across spaced intervals at the neuronal, synaptic, and network levels. These spaced ensemble strategies yield significant performance gains across various benchmark datasets and network architectures. Biological experiments on Drosophila further validate the complementary effect of appropriate variations and spaced intervals in improving generalization, which together reveal a convergent computational principle shared by biological learning and machine learning.
Hassanejad Nazir, A.; Hellgren Kotaleski, J.; Liljenström, H.
As social beings, humans make decisions partly based on social interaction. Observing the behavior of others can lead to learning from and about them, potentially increasing trust and prompting trust-based behavioral changes. Observation-based decision making involves different neural structures. The orbitofrontal cortex (OFC) and lateral prefrontal cortex (LPFC) are known as neural structures mainly involved in processing emotional and cognitive decision values, respectively, while the anterior cingulate cortex (ACC) plays a pivotal role as a social hub, integrating the afferent expectancy signals from OFC and LPFC. This paper presents a neurocomputational model of the interplay between observational learning and trust, as well as their role in individual decision-making. Our model elucidates and predicts the emotional and rational behavioral changes of an individual influenced by observing the action-outcome association of an alleged expert. We have modeled the neurodynamics of three cortical structures (OFC, LPFC, and ACC) and their interactions, where the neural oscillatory properties, modeled with Dynamic Bayesian Probability, represent the observer's attitude towards the expert and the decision options. As an example of an everyday behavioral situation related to climate change, we use the choice of transportation between home and work. The EEG-like simulation outputs from our model represent the presumed brain activity of an individual making such a choice, assuming the decision-maker is exposed to social information.
Hennig, J. A.; Burrell, M.; Uchida, N. A.; Gershman, S. J.
Animals exposed to pairings of a neutral stimulus with reward acquire a conditioned response to the neutral stimulus. A prominent hypothesis, formalized in the Temporal Difference (TD) learning algorithm, is that animals learn to predict the future reward associated with the neutral stimulus ("value"). Though the TD algorithm does not explicitly specify what drives conditioned responding, a typical assumption is that it reflects the animal's estimate of value. In TD learning, value estimates are updated using reward prediction error (RPE, the discrepancy between observed and predicted reward), and are thought to be signaled by the phasic activity of midbrain dopamine neurons. This hypothesis posits that dopamine's effects on conditioned responding are mediated entirely by its effects on learning. However, recent experimental and theoretical evidence suggests that dopamine may play a more direct role in modulating conditioned responding. We use a combination of data analysis and computational modeling to probe the relationship between dopamine and conditioned responding. Our results suggest that dopamine directly modulates conditioned responding, in addition to its role in learning. These findings can be captured by a model in which dopamine RPE acts both indirectly (via learning) and directly on conditioned responding.
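The value-learning account summarized above can be illustrated with a minimal tabular TD(0) sketch: a cue at step 0 is followed by reward at the last step of each trial, and the RPE at the reward step shrinks as value propagates back to the cue. Function name, trial structure, and parameter values here are our own illustration, not the authors' model.

```python
def td_conditioning(n_trials=200, alpha=0.1, gamma=0.95, n_steps=5):
    """Tabular TD(0) over a conditioning trial: cue at step 0, reward at
    the last step. Returns learned state values and the reward-step RPE
    recorded on every trial."""
    V = [0.0] * (n_steps + 1)          # value per within-trial state; last is terminal
    reward_rpes = []
    for _ in range(n_trials):
        for t in range(n_steps):
            r = 1.0 if t == n_steps - 1 else 0.0   # reward only at trial end
            rpe = r + gamma * V[t + 1] - V[t]      # reward prediction error
            V[t] += alpha * rpe
            if t == n_steps - 1:
                reward_rpes.append(rpe)
    return V, reward_rpes

V, rpes = td_conditioning()
```

As learning proceeds, the cue-state value `V[0]` approaches the discounted reward and the RPE at reward delivery decays toward zero, the signature commonly attributed to dopaminergic phasic activity.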
Lorenzi, R. M.; De Grazia, M.; Gandini Wheeler-Kingshott, C. A. M.; Palesi, F.; D'Angelo, E. U.; Casellato, C.
A mean field model (MFM) is a mesoscopic description of neuronal population dynamics that can reduce the complexity of neural microcircuits into equations preserving key functional properties. The generation of a MFM is a complex mathematical process that starts with the incorporation of single neuron input/output relationships and local connectivity. Once neuron electroresponsiveness and synaptic properties are defined, in principle, the process can be automated. Here we develop a tool for automatic MFM derivation from biophysically grounded spiking networks (Auto-MFM) by performing micro-to-mesoscale parameter remapping, estimating input/output relationships specific for different neuronal populations (i.e., transfer functions), and optimizing transfer function parameters. Auto-MFM was tested using a spiking cerebellar circuit as a generative model. The cerebellar MFM derived with Auto-MFM accurately reproduced cerebellar population dynamics of the corresponding spiking network, matching mean and time-varying firing rates across a wide range of stimulation patterns. Auto-MFM allowed us to model and explore physiological and pathological circuit variants; indeed, it was used to map ataxia-related structural connectivity alterations of the cerebellar network, in which Purkinje cells with simplified dendritic structure altered the cerebellar connectivity. Furthermore, Auto-MFM was used to create a library of cerebellar MFMs by sweeping the level of the excitatory conductance at the mossy fiber-granule cell synapse, which is altered in several neuropathologies. Auto-MFM thus proves to be a flexible and powerful tool to generate region-specific MFMs of healthy and pathological brain networks to be embedded in digital brain models.
Geminiani, A.; Meier, J. M.; Perdikis, D.; Ouertani, S.; Casellato, C.; Ritter, P.; D'Angelo, E. U.
The impact of cellular activities on large-scale brain dynamics is thought to determine brain functioning and disease, yet the causal relationships of neural mechanisms across scales remain unclear. Recently, the cerebellum has been reported to affect whole-brain dynamics during sensorimotor integration. To disclose the underlying mechanisms, we have developed a multiscale digital brain co-simulator, in which a spiking neural network of the olivo-cerebellar microcircuit is embedded in a mouse virtual brain and wired with other nodes using an atlas-based long-range connectome. Parameters and bi-directional interfaces between the spiking olivo-cerebellar network and other rate-coded modules were tuned to match experimental data of primary sensory and motor cortex (M1 and S1) power spectral densities and neuronal spiking rates. Then, the role of the cerebellar circuitry in sensorimotor integration was analyzed by lesioning critical circuit connections in silico. Simulations showed that spike processing within the cerebellar circuit is key to explaining the gamma-band coherence between M1 and S1 during sensorimotor integration. These results provide a mechanistic explanation of how the cerebellum promotes the formation of sensorimotor contingencies in relevant cortical modules as the basis of its critical role in sensorimotor prediction. More broadly, this modelling approach opens new avenues for the multiscale investigation of brain physiological and pathological states in relation to specific cellular and microcircuit properties.
Raval, V.; Oaks-Leaf, R.; Chen, Q.; Rieke, F.
Receptive fields provide a concise description of the stimulus selectivity of visual neurons. But this stimulus selectivity is neither static nor linear, and these nonlinear effects are not well captured by standard linear or pseudo-linear receptive field models. At the same time, receptive field models incorporating nonlinear effects are largely empirical, and are not easily interpreted in terms of underlying cellular and synaptic mechanisms. Here we show that two nonlinear mechanisms in the primate outer retina shape neural responses and that these contribute significantly to responses to natural stimuli and to the retinal output signals. Incorporating these outer retinal nonlinearities into models for visual function will improve our ability to identify the mechanistic origin of specific features of downstream visual responses.
Kula, B.; Chen, T.-J.; Nagy, B.; Hovhannisyan, A.; Terman, D.; Sun, W.; Kukley, M.
Glutamatergic neuronal synapses in the mouse neocortex mature during the first two months after birth. A key event during synaptic maturation is a change in short-term synaptic plasticity (STP), i.e. a switch from strong synaptic depression to a weaker depression or even facilitation. Glutamatergic pyramidal neurons located in the cortical layers II/III, layer V, and layer VI project axons through the corpus callosum where they release glutamate along their shafts and form glutamatergic synapses with oligodendrocyte precursor cells (OPCs). Here, we used single-cell electrophysiological recordings in brain slices to investigate synaptic plasticity at neuron-OPC synapses along axonal shafts in the white matter, and applied computational approaches to pinpoint the mechanisms of this plasticity. We found that during postnatal development of mice, there is a switch from short-term synaptic depression to short-term synaptic facilitation at glutamatergic neuron-OPC synapses in the corpus callosum. The synaptic delay of the phasic neuron-OPC excitatory postsynaptic current shortens, and the amount of asynchronous release at neuron-OPC synapses decreases as animals mature, indicating that glutamate release becomes more synchronized. Our computational modelling suggests that both pre- and postsynaptic changes may contribute to the functional development and changes of plasticity at neuron-OPC synapses in the white matter. Taken together, our findings indicate that synaptic release machineries located at different sites along the same axon (i.e. axonal shaft in the white matter vs synaptic boutons in the grey matter) mature in a very similar fashion, STP occurs at both synaptic sites, and STP dynamics represent an important event during brain maturation.
Zhang, Y.; Zhang, J.; Zang, Y.
Detecting novel stimuli is a fundamental neural function, yet its machine learning counterpart--out-of-distribution (OOD) detection--remains challenging, with models often making overconfident predictions on unseen inputs. Inspired by the strong pattern-separation capabilities of cerebellum-like circuits, we introduce a cerebellum-inspired kernel with an efficient closed-form implementation. Combining random Gaussian projection with Top-k sparsification, the kernel reshapes similarities in high-dimensional space to enhance separability between in-distribution (ID) and OOD samples. On OpenOOD benchmarks, our kernel consistently improves multiple baseline methods, and pairing it with the energy score achieves performance comparable to or exceeding current state-of-the-art approaches. The closed-form design also avoids the high computational cost of large-expansion explicit mapping. These results demonstrate the generality and potential of cerebellar kernels for OOD detection and other tasks requiring efficient pattern separation under limited computational resources.
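The expansion-plus-sparsification idea described above can be sketched in a few lines: project inputs into a high-dimensional "granule cell" layer with random Gaussian weights, then keep only the k largest activations per sample. Function name, dimensions, and defaults are our own assumptions, not the paper's implementation.

```python
import numpy as np

def cerebellar_features(X, n_expansion=512, k=25, seed=0):
    """Random Gaussian projection to a high-dimensional layer, followed by
    Top-k sparsification per sample (all other units are zeroed)."""
    rng = np.random.default_rng(seed)
    W = rng.standard_normal((X.shape[1], n_expansion)) / np.sqrt(X.shape[1])
    H = X @ W                                  # dense expansion
    drop = np.argsort(H, axis=1)[:, :-k]       # indices of all but the top k
    H_sparse = H.copy()
    np.put_along_axis(H_sparse, drop, 0.0, axis=1)
    return H_sparse

X = np.random.default_rng(1).standard_normal((8, 32))
H = cerebellar_features(X, n_expansion=128, k=10)
K = H @ H.T    # similarity under the sparsified representation
```

Because only k units survive per sample, inner products between samples that activate disjoint unit sets vanish, which is the pattern-separation property the kernel exploits.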
Gargano, J. A.; Rice, A.; Chari, D. A.; Parrell, B.; Lammert, A. C.
Reverse correlation is a widely used and well-established method for probing latent perceptual representations in which subjects render subjective preference responses to ambiguous stimuli. Stimuli are purposefully designed to have no direct relationship with the target representation (e.g., they are randomly generated), a property which makes each individual stimulus minimally informative toward reconstructing the target, and often difficult to interpret for subjects. As a result, a large number of stimulus-response pairs must be gathered from a given subject in order for reconstructions to be of sufficient quality, making the task fatiguing. Recent work has demonstrated that the number of trials needed can be substantially reduced using a compressive sensing framework that incorporates into the reconstruction process the assumption that the target representation can be sparsely represented in some basis. Here, we introduce an alternative method that incorporates the sparsity assumption directly into stimulus generation, which holds promise not only for improving efficiency, but also for improving the interpretability of stimuli from the subjects' perspective. We develop this new method as a mathematical variation of the compressive sensing approach, before conducting one simulation study and two human subjects experiments to assess the benefits of this method to reconstruction quality, sample size efficiency, and subjective interpretability. Results show that sparse stimulus generation improves all three of these areas relative to conventional reverse correlation approaches, and also relative to compressive sensing in most conditions.
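The core move of sparse stimulus generation, building each stimulus from only a handful of coefficients in a fixed basis rather than from dense noise, can be sketched as follows. The basis here is an orthonormal DCT-II matrix, and all names and default values are illustrative choices of ours, not the paper's design.

```python
import numpy as np

def sparse_stimuli(n_stimuli=100, dim=64, sparsity=4, seed=3):
    """Generate stimuli that are sparse in a fixed orthonormal basis:
    draw `sparsity` random coefficients per stimulus, then map back
    to stimulus space."""
    rng = np.random.default_rng(seed)
    # orthonormal DCT-II basis (rows are basis vectors)
    n = np.arange(dim)
    B = np.cos(np.pi * np.outer(n, 2 * n + 1) / (2 * dim))
    B[0] *= np.sqrt(1 / dim)
    B[1:] *= np.sqrt(2 / dim)
    stimuli = np.zeros((n_stimuli, dim))
    for i in range(n_stimuli):
        idx = rng.choice(dim, size=sparsity, replace=False)
        stimuli[i] = rng.standard_normal(sparsity) @ B[idx]
    return stimuli, B

stimuli, B = sparse_stimuli()
coeffs = stimuli @ B.T    # each row has at most `sparsity` nonzero entries
```

Contrast this with conventional reverse correlation, where each stimulus is dense noise and is therefore nonzero in essentially every basis coefficient.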
Liu, P.; Bo, K.; Chen, Y.; Keil, A.; Ding, M.; Fang, R.
Emotion reshapes perception by modulating sensory processing through top-down feedback--a process referred to as emotional perception. The computational mechanisms by which distinct affective signals influence visual representations, however, remain poorly understood. Here, we use a deep neural network to simulate this process and test mechanistic hypotheses about how top-down feedback guides emotional perception. Most existing models treat the perception of emotional content as a static, feedforward task, overlooking the dynamic interplay between internal states, external goals, and sensory input that characterizes affective perception in the brain. We introduce EmoFB, a biologically inspired model that integrates an affective system with a visual processing hierarchy through two functionally distinct feedback signals: intrinsic feedback, arising from the model's own affective appraisal of perceptual input, and external steering, conveying contextual priors such as task expectations or target categories. We evaluated EmoFB on three tasks varying in perceptual ambiguity (Single Image, Side-by-Side, and Overlay). External steering exerted the strongest influence, not only improving recognition under challenging conditions but also restructuring internal representations by sharpening category-specific clustering in feature space. Crucially, top-down feedback increased brain-model representational similarity, strengthening alignment with human fMRI responses across early visual cortex, ventral visual areas, and the amygdala. EmoFB provides a computational framework for testing neurocognitive theories of emotion appraisal and top-down feedback modulation. It bridges affective neuroscience and artificial intelligence, offering mechanistic insight into how emotional signals shape perception in both brains and machines.
Gupta, R.; Karmeshu, ; Singh, R. K. B.
Voltage perturbations to a repetitively firing Hodgkin-Huxley (HH) model of neuronal spiking in the bistable regime with coexisting limit cycle and stable steady node can either lead to phase resetting of the spikes or to collapse to the stable steady state. The latter describes a non-firing hyperpolarized quiescent state of the neuron despite the presence of constant external current. Using the asymptotic phase response curve (PRC), the impact of voltage perturbations on a repetitively firing HH model is studied here while it is diffusively coupled to another HH model under identical external stimulation. It is observed that the pre-perturbation state of synchronization and the coupling strength critically determine the PRC response of the perturbed HH dynamics. Higher coupling strengths of perfectly in-phase (anti-phase) synchronized HH models shrink (expand) the combinatorial space of perturbation strengths and the oscillation phases causing collapse to the quiescent state. This indicates a reduced (enlarged) basin of attraction, viz. the null space, associated with the steady state in the HH phase space. The findings have important implications for the spiking dynamics of diverse interneurons, as well as special cases of pyramidal neurons, coupled through electrical synapses via gap junctions, and suggest the role of gap junction plasticity in tuning vulnerability to the quiescent state in the presence of biological noise and spikelets.
Sohn, K.; Yoon, D.; Lee, J.; Choi, S.
The claustrum, with its extensive reciprocal connections to nearly all cortical regions, has long been hypothesized as a key hub for integrating diverse cognitive, sensory and motor information. However, despite its anatomical connectivity, whether and how it functionally integrates different inputs to generate coherent representations has remained unclear. Here, we developed a recurrent neural network (RNN) trained via supervised learning on behavioral metrics of delayed escape, a behavioral paradigm that requires integration of temporally separated task-relevant signals. A subset of RNN neurons exhibited dynamics similar to those of anterior claustral neurons during this behavior. These neurons formed a recurrent cluster, a structure supported by in vitro stimulation experiments in claustral brain slices. We analyzed the computational properties of this claustrum-like cluster via dimensionality reduction of population activity. The network showed nonlinear integration of temporally distributed inputs and increased synergistic information. Rather than settling into attractors, integrated information was dynamically encoded along continuously evolving neural trajectories. Notably, similar trajectory patterns associated with dynamic integration were observed in claustral recordings, suggesting the model's biological plausibility. We propose that the anterior claustrum dynamically integrates task-relevant input signals over time and broadcasts the evolving representation to downstream brain regions capable of reading and interpreting it in a context-dependent manner.
Santhosh, A.; Narayanan, R.
Artificial recurrent networks are powerful models for studying neural dynamics and representations underlying complex cognitive tasks. However, the impact of neural-circuit heterogeneities on learning, dynamics, robustness, and generalization in these networks remains poorly understood. Here, we systematically investigated the impact of graded intrinsic heterogeneities in artificial recurrent networks trained on different cognitive tasks using reward-modulated Hebbian learning. Across networks trained with distinct hyperparameters and different levels of intrinsic heterogeneity, we observed pronounced network-to-network and task-to-task variability in training convergence, error dynamics during training, and task performance. These effects were strongly task dependent, with memory-dependent tasks exhibiting greater sensitivity to heterogeneity than memoryless tasks. We assessed these networks for robustness to multiple forms of graded post-training perturbations. Perturbations to intrinsic time constant distributions altered network dynamics, but had limited impact on final task accuracy in most cases. In contrast, perturbations to initial conditions, exploratory activity impulses, or task epoch durations strongly affected memory-dependent tasks. Among all perturbations, synaptic jitter was consistently the most detrimental, impairing performance across all tasks and heterogeneity levels. Importantly, despite such pronounced impact of heterogeneities, none of the metrics (spanning training, performance, dynamics, and robustness) varied monotonically with the level of training heterogeneity, instead showing additional dependencies on task demands, network configuration, and perturbation type. Finally, networks trained on a single task were able to perform structurally related untrained tasks, but failed on fundamentally distinct tasks. Strikingly, similar task performances emerged from divergent activity trajectories across networks and training conditions, together revealing pronounced functional degeneracy in network dynamics. Collectively, our findings establish that heterogeneous recurrent networks operate in a complex systems regime, where robust function emerges from non-unique, task-specific interactions among hyperparameters, dynamics, and heterogeneities. Our analyses emphasize the need for population-of-networks approaches that focus on interactions among multiple forms of neural heterogeneities in shaping learning and computation.
Leites, F. L.; Herbert, C. T.; Boari, S.; Cignoli, F. I.; Mindlin, G. B.; Amador, A.
Birdsong is a complex learned behavior that requires millisecond-scale precision in the coordinated activation of respiratory and vocal muscles to generate sound. Canary song consists of sequences of syllables organized into phrases, in which each syllable type is repeated at a characteristic rate, giving rise to a well-defined rhythmic vocal behavior. Here, we analyze neural population activity in the telencephalic song system nucleus HVC of singing adult male canaries (Serinus canaria), in relation to both vocal output and the underlying respiratory motor gestures. To uncover structure in these high-dimensional neural recordings, we used an unsupervised autoencoder. We found that a three-dimensional latent space was sufficient to reconstruct the data with minimal information loss, revealing a low-dimensional representation of HVC population activity. The oscillation frequencies of the latent modes closely matched both the syllabic repetition rate and the corresponding respiratory motor patterns. These results show that multiunit activity in HVC captures key rhythmic features of song at the population level, providing a dynamical representation of behaviorally relevant motor structure. More broadly, our findings highlight how data-driven dimensionality reduction can reveal structured, low-dimensional neural dynamics underlying complex learned motor behaviors.
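The finding that a three-dimensional latent space suffices to reconstruct the population recordings can be illustrated with a toy example. The paper uses an unsupervised autoencoder; here PCA serves as a simple linear stand-in, and the synthetic data (30 channels driven by 3 rhythmic latents) is entirely our own construction.

```python
import numpy as np

rng = np.random.default_rng(0)
t = np.linspace(0.0, 2.0, 400)
# three latent "syllable-rate" oscillations driving 30 recorded channels
latents = np.stack([np.sin(2 * np.pi * 4 * t),
                    np.cos(2 * np.pi * 4 * t),
                    np.sin(2 * np.pi * 9 * t)])
mixing = rng.standard_normal((30, 3))
data = mixing @ latents + 0.05 * rng.standard_normal((30, t.size))

# PCA as a linear stand-in for the autoencoder: variance captured by
# the top three components
X = data - data.mean(axis=1, keepdims=True)
S = np.linalg.svd(X, compute_uv=False)
explained_3 = (S[:3] ** 2).sum() / (S ** 2).sum()
```

When the population activity is truly driven by a few rhythmic modes, almost all variance collapses onto that many components, which is the signature the autoencoder analysis exploits in the HVC recordings.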
Vloeberghs, R.; Tuerlinckx, F.; Urai, A. E.; Desender, K.
A widely used framework for studying the computational mechanisms of decision making is the Drift Diffusion Model (DDM). To account for the presence of both fast and slow errors in empirical data, the DDM incorporates across-trial variability in parameters such as the drift rate and the starting point. Although these variability parameters enable the model to reproduce both fast and slow errors, they rely on the assumption that each parameter is independently sampled across trials. As a result, the DDM effectively predicts that errors--whether fast or slow--occur randomly over time. However, in empirical data this assumption is violated, as error responses are often temporally clustered. To address this limitation, we introduce the autocorrelated DDM, in which trial-to-trial fluctuations in drift rate, starting point, and boundary evolve according to first-order autoregressive (AR1) processes. Using simulations, we demonstrate that, unlike the across-trial variability DDM, the autocorrelated DDM naturally accounts for temporal clustering of errors. We further show that model parameters can be reliably recovered using Amortized Bayesian Inference, even with as few as 500 trials. Finally, fits to empirical data indicate that the autocorrelated DDM provides the best account of error clustering, highlighting that computational parameters fluctuate over time, despite typically being estimated as fixed across trials.
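The autocorrelated-DDM idea can be sketched by letting the drift rate carry an AR(1) fluctuation across trials and simulating each trial's diffusion to a bound: trials that fall in a low-drift stretch of the AR(1) process produce clustered errors. All parameter names and defaults below are our own illustration, not the authors' fitted model.

```python
import numpy as np

def simulate_ar1_ddm(n_trials=200, phi=0.8, sigma_v=0.3, v_mean=1.0,
                     a=1.5, dt=0.005, s=1.0, seed=1):
    """DDM whose drift carries an AR(1) fluctuation across trials:
    x_t = phi * x_{t-1} + eps_t,  v_t = v_mean + x_t.
    Evidence diffuses from 0 toward bounds at +/- a (Euler-Maruyama)."""
    rng = np.random.default_rng(seed)
    x = 0.0
    choices = np.empty(n_trials, dtype=int)
    rts = np.empty(n_trials)
    for i in range(n_trials):
        x = phi * x + rng.normal(0.0, sigma_v)   # slowly drifting trial state
        v = v_mean + x
        e, t = 0.0, 0.0
        while abs(e) < a:
            e += v * dt + s * np.sqrt(dt) * rng.normal()
            t += dt
        choices[i] = 1 if e > 0 else 0           # 1 = correct bound
        rts[i] = t
    return choices, rts

choices, rts = simulate_ar1_ddm()
```

Setting `phi = 0` recovers the standard independently-sampled drift-variability DDM, making the contrast between random and clustered errors easy to explore.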
Choi, J. D.; Kumar, V.
Markerless pose estimation has emerged as a powerful technique for animal behavior quantification, capable of high resolution tracking of body parts. Many neuroscience labs rely on tools like DeepLabCut and SLEAP, which provide accessible interfaces but restrict users to a narrow set of models and configurations. In this work, we adopt MMPose, an open-source, general-purpose computer vision library, to build a workflow for training and evaluating multiple state-of-the-art models on animal video datasets. We benchmark these models in two scenarios: (1) a complex maze assay with occlusions and varied backgrounds, and (2) a simpler open field arena with a high-contrast background. Our results show that a bottom-up model (DEKR) delivers the highest accuracy in the complex task, whereas lighter-weight models (e.g., SLEAP) offer superior speed, highlighting a clear trade-off between accuracy and throughput. We also evaluate a recently published foundation model (TopViewMouse-5K) trained on a large top-view mouse dataset to test its generalization. It performs poorly on our tasks in zero-shot evaluation, and even when we combine its data with our training set, we observe no consistent benefit. These findings emphasize the importance of context-specific model selection and the need for more diverse training data to create generalizable pose models. By leveraging a general-purpose vision library, researchers can flexibly choose models that best suit their experimental needs. This work illustrates how adopting advanced computer vision frameworks can accelerate behavioral neuroscience and genetics research, paving the way for more scalable, reproducible, and sensitive analysis of animal behavior.
Diekmann, N.; Lissek, S.; Uengoer, M.; Cheng, S.
The progress of learning is usually quantified by averaging responses across participants and/or multiple trials within a block. However, such approaches obscure the trial-by-trial progress of learning, which has been shown recently to express a rich variety of dynamics. An alternative approach that does not suffer from this problem is the detection and analysis of points of behavioral change, i.e., change-point analysis. Using change-point analysis, we reanalyzed data from human participants in different predictive learning tasks in which learned contingencies underwent reversal. We find that responses of individual participants were more accurately characterized by behavioral change points than the average learning curve. Importantly, change points significantly shifted to later trials during reversal learning, indicating that reversal learning is more difficult than the initial learning. In a computational model based on deep reinforcement learning, we show that the change point shift required the replay of previous experiences, which in turn depends on the hippocampus. This finding is consistent with studies showing that lesions of the hippocampus yield faster reversal learning. In summary, we reaffirm the importance of the analysis of single participant responses, show that phenomenological learning rates are slower during reversal learning, and provide a theoretical account for this difference.
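A generic single change-point estimator illustrates the kind of trial-by-trial analysis described above: find the split of a response sequence that a two-segment piecewise-constant fit explains best. This is a textbook least-squares sketch of change-point analysis, not the authors' exact estimator.

```python
def change_point(responses):
    """Least-squares single change point: the split index minimizing the
    summed squared error of a two-segment piecewise-constant fit."""
    n = len(responses)
    best_k, best_cost = 1, float("inf")
    for k in range(1, n):
        left, right = responses[:k], responses[k:]
        m_l = sum(left) / k
        m_r = sum(right) / (n - k)
        cost = (sum((x - m_l) ** 2 for x in left)
                + sum((x - m_r) ** 2 for x in right))
        if cost < best_cost:
            best_cost, best_k = cost, k
    return best_k

# a learner whose responses switch from wrong (0) to correct (1) at trial 20
cp = change_point([0] * 20 + [1] * 20)
```

Applied per participant, such an estimator yields an individual trial of behavioral change rather than a smoothed group-average learning curve.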
Gambrell, O.; Singh, A.
A key component of interneuronal communication is the modulation of postsynaptic firing frequencies by stochastic transmitter release from presynaptic neurons. The time interval between successive postsynaptic firings is called the inter-spike interval (ISI), and understanding its statistics is integral to neural information processing. We start with a model of an excitatory chemical synapse with postsynaptic neuron firing governed as per a classical integrate-and-fire model. Using a first-passage time framework, we derive exact analytical results for the ISI statistical moments, revealing parameter regimes driving precision in postsynaptic action potential timing. Next, we extend this analysis to include both an excitatory and an inhibitory presynaptic connection onto the same postsynaptic neuron. We consider both a fixed postsynaptic-firing threshold and a threshold that adapts based on the postsynaptic membrane potential history. Our analysis shows that the latter adaptive threshold can result in scenarios where increasing the inhibitory input frequency increases the postsynaptic firing frequency. Moreover, we characterize parameter regimes where ISI noise is hypoexponential or hyperexponential based on its coefficient of variation being less than or higher than one, respectively.
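The first-passage view of the ISI can be illustrated with the simplest counting model: a postsynaptic spike fires once a fixed number of excitatory inputs, arriving as a Poisson process, have accumulated, so each ISI is a sum of exponentials (Erlang, hence hypoexponential with CV < 1). All names and defaults below are illustrative, not the paper's model.

```python
import random
import statistics

def isi_samples(rate=50.0, n_required=10, n_spikes=2000, seed=7):
    """Counting-model first-passage sketch: each postsynaptic spike needs
    `n_required` excitatory inputs arriving as a Poisson process of
    `rate` Hz, so each ISI is Erlang-distributed."""
    rng = random.Random(seed)
    return [sum(rng.expovariate(rate) for _ in range(n_required))
            for _ in range(n_spikes)]

isis = isi_samples()
mean_isi = statistics.fmean(isis)        # theory: n_required / rate = 0.2 s
cv = statistics.stdev(isis) / mean_isi   # theory: 1 / sqrt(10) ~ 0.32
```

Raising `n_required` (more inputs per spike) drives the CV toward zero, i.e. more precise spike timing; mixtures of fast and slow first-passage paths would instead push the CV above one (hyperexponential).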
Schmitt, F. J.; Müller, F. L.; Nawrot, M. P.
Neural population activity typically evolves on low-dimensional manifolds and can be described as trajectories in attractor-like state spaces, including metastable switching among quasi-stable assembly states. Here we develop a unified definition of clustered neural networks with local excitatory-inhibitory balance in which enhanced within-cluster effective coupling can be realized by connection probability (structural clustering), synaptic efficacy (weight clustering), or any mixture of both. We introduce a single mixing parameter κ ∈ [0, 1] that redistributes a defined clustering contrast between connection probabilities and synaptic efficacies while preserving the mean input of a balanced random network. Using mean-field theory and network simulations, we show that metastable dynamics are supported across the full κ continuum. Shifting contrast between structural and weight clustering changes higher-order input structure, reshaping multistable regimes, neuronal correlations, and the balance between single- and multi-cluster episodes. Because real nervous systems jointly organize topology and synaptic strength, our approach provides a biologically realistic assembly definition and a basis for future models combining structural and functional plasticity. In practical terms, κ offers a translation axis for neuromorphic and other constrained substrates, clarifying trade-offs between routing resources and synaptic weight resolution when implementing attractor-based computational primitives such as winner-take-all decisions and working-memory states for artificial agents.
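One simple way to realize such a mixing parameter is to split a within-cluster coupling gain g geometrically between connection probability and synaptic efficacy, so that the effective within-cluster coupling (their product) is the same for every κ. This parameterization is our own illustration consistent with the abstract, not necessarily the paper's exact definition.

```python
def clustered_params(p0, w0, g, kappa):
    """Split a within-cluster coupling gain g between connection
    probability (structural) and synaptic efficacy (weight) clustering.
    kappa = 0 puts all of g on probability, kappa = 1 puts it all on
    weights; intermediate kappa mixes the two while keeping the
    effective within-cluster coupling p_in * w_in fixed."""
    p_in = p0 * g ** (1.0 - kappa)
    w_in = w0 * g ** kappa
    return p_in, w_in

# baseline probability 0.1, baseline weight 0.5, clustering gain 4
pairs = {k: clustered_params(0.1, 0.5, 4.0, k) for k in (0.0, 0.5, 1.0)}
```

Whatever κ is chosen, `p_in * w_in` equals `p0 * w0 * g`, which is what lets the mean input of the balanced network stay fixed while the higher-order input statistics (few strong vs. many weak within-cluster synapses) change.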
Grandchamp des Raux, H.; Ghilardi, T.; Ferre, E. R.; Ossmy, O.
A critical aspect of human cognition is the ability to use our knowledge about the laws of physics to make predictions about physical events. Whether this ability is based on abstract processes or is grounded in our body-environment interactions remains an open debate. We used physical reasoning under altered gravity as a model system to show that humans' real-time embodied experience modifies their high-level physical reasoning. Specifically, we tested participants in computerised reasoning games, while disrupting their gravitational signalling using Galvanic Vestibular Stimulation (GVS). Participants failed more and had suboptimal strategies under the GVS condition compared to no-GVS in games requiring reasoning about terrestrial gravity. However, the effects of GVS were reduced when the games included reasoning about altered gravity. Our findings demonstrate how the physical experience of the body shapes high-level cognitive skills such as reasoning, suggesting that humans' mental representation of the world is grounded in adaptable physical mechanisms.